Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
Message-ID: <4ejb66INN111@keats.ugrad.cs.ubc.ca>
References: <4eh2ve$4u2@mark.ucdavis.edu>
NNTP-Posting-Host: keats.ugrad.cs.ubc.ca
In article <4eh2ve$4u2@mark.ucdavis.edu>,
Ricardo E Espinoza-Ibarra <espinoza@cs.ucdavis.edu> wrote:
>Does anyone know what the -O option in gcc C compiler does? I have tried to compile a program in two different ways: 1) cc -o first first.c, and it works fine; 2) cc -o first first.c -O, and it runs faster than the other one. I checked to see if it had changed the program code in any way, and it hadn't.
>This option stands for "optimizer", but I want to know how the heck it optimizes the program! Please respond as soon as possible.
It optimizes by generating different machine code for the same C code: machine
code that performs the same computation but uses fewer CPU cycles. Sometimes
optimized code is longer than unoptimized code, but usually it is shorter.
Code can be optimized when it is still in an intermediate representation. In
fact, it has to be optimized, because the initial code that is generated from
the C source has all kinds of redundant instructions.
One of the most common things that is done is to prune instructions that
calculate duplicate results or results that are never used (common
subexpression elimination and dead code elimination).
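Here is a rough source-level picture of that pruning. The real work happens
on the intermediate representation, not on your C text, and the function
names (before, after, check_same) are made up for this sketch:

```c
#include <assert.h>

/* What the programmer wrote: a + b is computed twice, and 'unused' is
   calculated but never read. */
int before(int a, int b)
{
    int unused = a * 7;      /* dead: the result is never used */
    int x = (a + b) * 2;
    int y = (a + b) * 3;     /* duplicate computation of a + b */
    return x + y;
}

/* Roughly what the optimizer arranges for: one addition, and the dead
   multiply is gone. */
int after(int a, int b)
{
    int t = a + b;           /* computed once, reused twice */
    return t * 2 + t * 3;
}

/* Both versions must give the same answer for the same inputs. */
void check_same(int a, int b)
{
    assert(before(a, b) == after(a, b));
}
```

The compiler never shows you the second function; it is only a way of
reading what the generated instructions end up doing.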
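Optimization also involves reducing the strength of operations. For example, a
division of an unsigned integer by two can be done by a shift right. (For a
signed integer the shift alone rounds negative values the wrong way, so the
compiler adds a small correction.)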
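A small sketch of the divide-by-two case, written out by hand. The function
names are invented for illustration, and the signed version assumes that >>
on a negative int is an arithmetic shift, which is what GCC does but which
the C standard leaves implementation-defined:

```c
#include <assert.h>
#include <limits.h>

/* Unsigned: dividing by two and shifting right by one always agree. */
unsigned half_div(unsigned n)   { return n / 2; }
unsigned half_shift(unsigned n) { return n >> 1; }

/* Signed: C division truncates toward zero, but an arithmetic shift
   rounds toward negative infinity, so -7 >> 1 is -4 while -7 / 2 is -3. */
int shalf_div(int n) { return n / 2; }

int shalf_shift(int n)
{
    /* Assumption of this sketch: right shift of a negative int is
       arithmetic (true for GCC, not guaranteed by the standard). */
    assert((-1 >> 1) == -1);

    /* bias is 1 when n is negative and 0 otherwise (the sign bit). */
    int bias = (int)((unsigned)n >> (sizeof(int) * CHAR_BIT - 1));

    /* Adding the sign bit first makes the shift round toward zero. */
    return (n + bias) >> 1;
}
```

With the bias added, shalf_shift(-7) gives -3, the same as shalf_div(-7),
and positive values are unaffected because the bias is zero for them.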
There is also machine-specific optimization, which is typically done by a
"peephole" method. A narrow window of instructions from each basic block is
scanned top to bottom. If the sequence in the "peephole" matches a known
pattern, it is replaced by an equivalent sequence that is faster.
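To make the peephole idea concrete, here is a toy pass over an invented
instruction set. Nothing here is GCC's actual code or any real machine; the
opcodes, struct toy_insn, and the three patterns are assumptions chosen only
to show the shape of the technique:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction set, invented for this sketch. */
enum toy_op { OP_NOP, OP_LOAD, OP_STORE, OP_ADD_IMM, OP_MUL_IMM, OP_SHL_IMM };

struct toy_insn {
    enum toy_op op;
    int reg;   /* register operand */
    int arg;   /* memory slot (LOAD/STORE) or immediate value */
};

/* Slide a window of at most two instructions over code[0..n-1] and
   rewrite matches in place. Returns the number of rewrites made. */
size_t peephole(struct toy_insn *code, size_t n)
{
    size_t changes = 0;

    assert(code != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct toy_insn *a = &code[i];

        /* Pattern 1: multiply by 2 becomes shift left by 1. */
        if (a->op == OP_MUL_IMM && a->arg == 2) {
            a->op = OP_SHL_IMM;
            a->arg = 1;
            changes++;
        }

        /* Pattern 2: adding 0 does nothing, so turn it into a no-op. */
        if (a->op == OP_ADD_IMM && a->arg == 0) {
            a->op = OP_NOP;
            changes++;
        }

        /* Pattern 3: STORE r,slot followed by LOAD r,slot; the register
           already holds the value, so the load is redundant. */
        if (i + 1 < n) {
            struct toy_insn *b = &code[i + 1];
            if (a->op == OP_STORE && b->op == OP_LOAD &&
                a->reg == b->reg && a->arg == b->arg) {
                b->op = OP_NOP;
                changes++;
            }
        }
    }
    return changes;
}
```

A real peephole optimizer works from a machine description and would also
delete the no-ops it leaves behind; this one keeps them so the indices stay
put and the rewrites are easy to inspect.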
The way registers are allocated during target code generation is also
important. When a compiler is told to optimize, it should try to be smart
about how it allocates registers to variables.
The GNU compiler has a module called "stupid.c" which performs "stupid"
register allocation, and is only invoked when you _don't_ use the -O flag.
There are many clever things that an optimizer can do. Compiler designers are
endlessly creative about it. One optimization, for instance, is known as 'jump
threading'. If the target of one jump instruction is another jump instruction,
the original jump is redirected to go directly to the target. Such code can
arise from nested block statements in the original code. For example, an inner
loop can exit by jumping to the instruction that follows the loop. But that
instruction can be the last statement of an outer loop which jumps back to the
top. Hence the inner jump should go straight to the top of the outer loop. The
initial translation stages might not catch something like that, since they are
driven by the syntax of the nested block.
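The jump-to-a-jump case can be sketched as a pass over a toy instruction
array. This is my own illustration, not GCC's implementation; the kinds
K_OTHER, K_JUMP and K_COND, struct flow_insn, and thread_jumps are all
hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy IR: only control flow is modelled. */
enum flow_kind { K_OTHER, K_JUMP, K_COND };

struct flow_insn {
    enum flow_kind kind;
    size_t target;   /* index jumped to; ignored for K_OTHER */
};

/* For every jump (conditional or not) whose target is an unconditional
   jump, follow the chain and point the original jump at the final
   destination. Returns the number of jumps that were redirected. */
size_t thread_jumps(struct flow_insn *code, size_t n)
{
    size_t changed = 0;

    assert(code != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (code[i].kind == K_OTHER)
            continue;

        size_t t = code[i].target;
        size_t hops = 0;

        /* Follow unconditional jumps only; a target past the end of the
           array or a non-jump instruction ends the chain. */
        while (t < n && code[t].kind == K_JUMP && hops < n) {
            t = code[t].target;
            hops++;
        }

        /* n hops means the chain loops on itself; leave it alone. */
        if (hops == n)
            continue;

        if (t != code[i].target) {
            code[i].target = t;
            changed++;
        }
    }
    return changed;
}
```

For the nested-loop case, picture instruction 2 as the inner loop's
conditional exit aimed at instruction 4, and instruction 4 as the outer
loop's jump back to instruction 0. After the pass, instruction 2 targets 0
directly.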
To see the difference between -O and not -O, try using the -S flag to generate
intermediate assembly. Use GCC, if you can. The default system compiler might